Abstract: This paper introduces a flexible Bayesian nonparametric Item Response
Theory (IRT) model, which applies to dichotomous or polytomous item responses, and which
can apply to either unidimensional or multidimensional scaling. This is an infinite-mixture
IRT model, with person ability and item difficulty parameters, and with a random intercept 
parameter that is assigned a mixing distribution, with mixing weights a probit function of 
other person and item parameters. As a result of its flexibility, the Bayesian nonparametric 
IRT model can provide outlier-robust estimation of the person ability parameters and the 
item difficulty parameters in the posterior distribution. The estimation of the posterior 
distribution of the model is undertaken by standard Markov chain Monte Carlo (MCMC) 
methods based on slice sampling. This mixture IRT model is illustrated through the analysis 
of real data obtained from a teacher preparation questionnaire, consisting of polytomous 
items, and consisting of other covariates that describe the examinees (teachers). For these 
data, the model obtains zero outliers and an R-squared of one. The paper concludes with a 
short discussion of how to apply the IRT model for the analysis of item response data, using
menu-driven software that was developed by the author.

This material is based upon work supported by National Science Foundation grant SES-1156372 from
the Program in Methodology, Measurement, and Statistics. The first author gives thanks to Wim J. van der
Linden and Brian Junker for feedback on this work.

1 Introduction


Given a set of data, consisting of persons' individual responses to items of a test, an item
response theory (IRT) model aims to infer each person’s ability on the test, and to infer 
the test item parameters. In typical applications of an IRT model, each item response is 
categorized into one of two or more categories. For example, each item response may be 
scored as either correct (1) or incorrect (0). From this perspective, a categorical regression 
model, which includes person ability parameters and item difficulty parameters, provides 
an interpretable approach to inferring from item response data. One basic example is the 
Rasch (1960) model. This model can be characterized as a logistic regression model, having 
the dichotomous item score as the dependent variable. The predictors (covariates) of this 
model include N person indicator (0,1) variables, corresponding to regression coefficients
that define the person ability parameters; and include I item indicator (0,-1) variables,
corresponding to coefficients that define the item difficulty parameters.

In many item response data sets, there are observable and unobservable covariates that
influence the item responses, in addition to the person and item factors. If the additional
covariates are not fully accounted for in the given IRT model, then the estimates of person
ability and item difficulty parameters can become noticeably biased. Such biases can be
(at least) partially alleviated by including the other, observable covariates into the IRT
(regression) model, as control variables. However, for most data collection protocols, it is
not possible to collect data on all the covariates that help determine the item responses
(e.g., due to time, financial, or ethical constraints). Then, the unobserved covariates, which
influence the item responses, can bias the estimates of the ability and item parameters in an
IRT model that does not account for these covariates.


A flexible mixture IRT model can provide robust estimates of person ability parameters 
and item difficulty parameters, by accounting for any additional unobserved latent covariates 
that influence the item responses. Modeling flexibility can be maximized through the use of 
a Bayesian nonparametric (BNP) modeling approach. 

In this chapter we present a BNP approach to infinite-mixture IRT modeling, based on
the general BNP regression model introduced by Karabatsos and Walker (2012). We then
illustrate this model, called the BNP-IRT model, through the analysis of real item response
data. The analysis was conducted using menu-driven (point-and-click) software, developed
by the author (Karabatsos 2014a, 2014b).

In the next section, we give a brief overview of the concepts of mixture IRT modeling,
and BNP infinite-mixture modeling. Then in Section 3, we introduce our basic BNP-IRT
model. This is a regression model consisting of person ability and item difficulty parameters,
constructed via the appropriate specification of person and item indicator predictor
variables, as mentioned above. While the basic model assumes dichotomous item scores
and unidimensional person ability, our model can be easily extended to handle polytomous
responses (with item response categories not necessarily ordered), extra person-level and/or
item-level covariates, and/or multidimensional person ability parameters. In Section 4, we
describe the Markov chain Monte Carlo (MCMC) methods that can be used to estimate the
posterior distribution of the model parameters. (This is a highly technical section, which can
be skipped when reading this chapter.) In Section 5, we describe methods for evaluating
the fit of our BNP-IRT model. Section 6 provides an empirical illustration of the BNP-IRT
model through the analysis of polytomous response data. The data were obtained from an
administration of a questionnaire that was designed to measure teacher preparation. Section
7 ends with a brief overview of how to use the menu-driven software to perform data analysis
using the BNP-IRT model. That section also includes a brief discussion of how to extend
the BNP-IRT model for cognitive IRT.

The remainder of this chapter makes use of the following notational conventions. Let
$\mathbf{U} = (U_1, \ldots, U_i, \ldots, U_I)^{\top}$ denote a random vector for the scores on a
test with $I$ items. A realized value of the item response vector is denoted by
$\mathbf{u} = (u_1, \ldots, u_i, \ldots, u_I)^{\top}$. We assume that each item $i = 1, \ldots, I$
has $m_i + 1$ possible discrete-valued scores, indexed by $u = 0, 1, \ldots, m_i$.

We use lower case to denote a probability mass function (pmf) of a value $u$ of a discrete
random variable (or vector, $\mathbf{u}$) or a probability density function (pdf) of a value $u$
of a continuous random variable (or $\mathbf{u}$), such as $f(u)$ or $f(\mathbf{u})$, respectively.
The given pmf (or pdf) $f(u)$ corresponds to a cumulative distribution function (cdf), denoted
by upper case $F(u)$, which gives the probability that the random variable $U$ does not exceed
$u$. $F(u)$ is sometimes more simply referred to as the distribution function. Thus, for example,
$\mathrm{N}(\mu, \sigma^2)$, $\mathrm{U}(0, b)$, $\mathrm{IG}(a, b)$ and $\mathrm{Be}(a, b)$ (or cdfs
$\mathrm{N}(\cdot \mid \mu, \sigma^2)$, $\mathrm{U}(\cdot \mid 0, b)$, $\mathrm{IG}(\cdot \mid a, b)$ and
$\mathrm{Be}(\cdot \mid a, b)$, respectively) denote the univariate normal, uniform, inverse-gamma,
and beta distribution functions, respectively. They correspond to pdfs
$\mathrm{n}(\cdot \mid \mu, \sigma^2)$, $\mathrm{u}(\cdot \mid 0, b)$, $\mathrm{ig}(\cdot \mid a, b)$,
$\mathrm{be}(\cdot \mid a, b)$, with mean and variance parameters $(\mu, \sigma^2)$, minimum and
maximum parameters $(0, b)$, shape and rate parameters $(a, b)$, and shape parameters $(a, b)$,
respectively. Also, if $\boldsymbol\beta$ is a realized value of a $K$-dimensional random vector,
then $\mathrm{N}(\boldsymbol\beta \mid \mathbf{0}, \mathbf{V})$ denotes the cdf of the multivariate
($K$-variate) normal distribution with mean vector of zeros $\mathbf{0}$ and $K \times K$
variance-covariance matrix $\mathbf{V}$, distribution function $\mathrm{N}(\mathbf{0}, \mathbf{V})$,
and corresponding to pdf $\mathrm{n}(\boldsymbol\beta \mid \mathbf{0}, \mathbf{V})$. The pmf or
pdf of $u$ given values of one or more variables $\mathbf{x}$ is written as $f(u \mid \mathbf{x})$
(with corresponding cdf $F(u \mid \mathbf{x})$); given a vector of parameter values
$\boldsymbol\zeta$ it is written as $f(u \mid \boldsymbol\zeta)$ (with corresponding cdf
$F(u \mid \boldsymbol\zeta)$); and conditionally on variables and given parameters it is written
as $f(u \mid \mathbf{x}; \boldsymbol\zeta)$ (with corresponding cdf $F(u \mid \mathbf{x}; \boldsymbol\zeta)$).
Also, $\sim$ means "distributed as," $\sim_{ind}$ means "independently distributed," and
$\sim_{iid}$ means "independently and identically distributed." For example, $U \sim F$,
$U \sim_{ind} F(u)$, $U \sim F(u \mid \mathbf{x}; \boldsymbol\zeta)$, $U \sim F(u \mid \boldsymbol\zeta)$,
$\eta \sim_{iid} F(\boldsymbol\zeta)$, $\boldsymbol\beta \sim \mathrm{N}(\mathbf{0}, \mathbf{V})$, or
$\sigma^2 \sim \mathrm{IG}(a, b)$. The preceding notation may replace $U$ by $\mathbf{U}$, replace
$F$ by $f$, replace $\mathrm{N}$ by $\mathrm{n}$, and/or replace $\mathrm{IG}$ by $\mathrm{ig}$.

2 Mixture IRT and Bayesian Nonparametrics 

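For any given vector of item response data $\mathbf{u} = (u_1, \ldots, u_I)^{\top}$, a discrete-mixture
IRT model admits the general form

$$f_{G_{\mathbf{x}}}(\mathbf{u} \mid \mathbf{x}) = \int f(\mathbf{u} \mid \mathbf{x}; \boldsymbol\beta, \boldsymbol\Psi(\mathbf{x}))\, dG_{\mathbf{x}}(\boldsymbol\Psi) = \sum_{j=1}^{J} f(\mathbf{u} \mid \mathbf{x}; \boldsymbol\beta, \boldsymbol\Psi_j(\mathbf{x}))\, \omega_j(\mathbf{x}), \tag{1}$$

conditionally on any given value of a vector of any covariates $\mathbf{x}$. In this expression,
$f(\mathbf{u} \mid \mathbf{x}; \boldsymbol\beta, \boldsymbol\Psi(\mathbf{x}))$ is the kernel of the
mixture, and $G_{\mathbf{x}}$ is a mixture distribution that may (or may not) depend
on the same covariates.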

Also, as shown in (1), this pmf is based on a mixture of $J$ pmfs
$f(\mathbf{u} \mid \mathbf{x}; \boldsymbol\beta, \boldsymbol\Psi_j(\mathbf{x}))$, $j = 1, \ldots, J$.
Here, $\boldsymbol\beta$ is a vector of (any available) fixed parameters that are not subject to the
mixture, the $\boldsymbol\Psi_j(\mathbf{x})$, $j = 1, \ldots, J$, are random parameters that are
subject to the mixture and that may be covariate dependent, and $J$ is the number of mixture
components. In addition, $\omega_j(\mathbf{x})$, $j = 1, \ldots, J$, are mixture weights that sum
to one for every given covariate value $\mathbf{x} \in \mathcal{X}$. The mixture model (1) is called
a discrete (continuous) mixture model if $G_{\mathbf{x}}$ is discrete (continuous); it is called a
finite (infinite) mixture model if $J$ is finite (infinite).

A simple example is given by the finite mixture Rasch model for dichotomous item scores
(Rost, 1990, 1991; von Davier & Rost, vol. 1, chap. 23), which assumes that

$$f(\mathbf{u} \mid \mathbf{x}; \boldsymbol\beta, \boldsymbol\Psi_j(\mathbf{x})) = \prod_{i=1}^{I} \frac{\exp(\theta_j - \beta_i)^{u_i}}{1 + \exp(\theta_j - \beta_i)}, \tag{2}$$

with a finite number of $J$ components and mixture weights that are not covariate-dependent
(i.e., $\omega_j(\mathbf{x}) = \omega_j$, $j = 1, \ldots, J < \infty$). The ordinary Rasch (1960)
model for dichotomous item scores is the special case of the model defined by (1) and (2) for
$J = 1$.
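As a minimal sketch of how (1) and (2) combine, the following Python fragment evaluates the probability of a response pattern under a finite mixture of Rasch components. The function names and all numerical values (item difficulties, component abilities, weights) are hypothetical and chosen only for illustration.

```python
import math

def rasch_pattern_prob(u, theta, beta):
    """Probability of the 0/1 pattern u under one Rasch component, as in (2)."""
    prob = 1.0
    for u_i, b_i in zip(u, beta):
        p_correct = 1.0 / (1.0 + math.exp(-(theta - b_i)))
        prob *= p_correct if u_i == 1 else 1.0 - p_correct
    return prob

def mixture_rasch_prob(u, thetas, weights, beta):
    """Finite mixture (1) with J = len(thetas) components and fixed weights."""
    return sum(w * rasch_pattern_prob(u, t, beta) for t, w in zip(thetas, weights))

beta = [-1.0, 0.0, 1.0]   # hypothetical item difficulties
thetas = [-0.5, 1.5]      # hypothetical component abilities theta_j
weights = [0.7, 0.3]      # mixture weights omega_j, summing to one

# The mixture is a proper pmf: its values over all 2^3 patterns sum to one.
patterns = [(a, b, c) for a in (0, 1) for b in (0, 1) for c in (0, 1)]
total = sum(mixture_rasch_prob(u, thetas, weights, beta) for u in patterns)
```

Setting `thetas` and `weights` to single-element lists gives the ordinary Rasch model, the case $J = 1$.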

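An infinite-mixture model is given by (1) for $J = \infty$. A general BNP infinite-mixture
IRT model assumes that the mixture distribution has the general form

$$G_{\mathbf{x}}(\cdot) = \sum_{j=1}^{\infty} \omega_j(\mathbf{x})\, \delta_{\boldsymbol\Psi_j(\mathbf{x})}(\cdot), \tag{3}$$

where $\delta_{\boldsymbol\Psi}(\cdot)$ denotes a degenerate distribution with support $\boldsymbol\Psi$.
This Bayesian model is completed by the specification of a prior distribution on
$\{\boldsymbol\Psi_j(\mathbf{x})\}_{j=1,2,\ldots}$, $\{\omega_j(\mathbf{x})\}_{j=1,2,\ldots}$, and
$\boldsymbol\beta$ with large supports.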

A common example is a Dirichlet process mixed IRT model, which assumes that the
mixing distribution is not covariate-dependent (i.e., $G_{\mathbf{x}}(\cdot) = G(\cdot)$), along
with a random mixing distribution $G(\cdot)$ constructed as
$G(\cdot) = \sum_{j=1}^{\infty} \omega_j \delta_{\boldsymbol\Psi_j}(\cdot)$, where
$\omega_j = v_j \prod_{k=1}^{j-1}(1 - v_k)$ for random draws $v_j \sim_{iid} \mathrm{Be}(1, \alpha)$
and $\boldsymbol\Psi_j \sim_{iid} G_0$, for $j = 1, 2, \ldots$. Here, $G$ is a Dirichlet
process (DP), denoted $G \sim \mathrm{DP}(\alpha, G_0)$, with baseline parameter $G_0$ and precision
parameter $\alpha$ (Sethuraman, 1994). The $\mathrm{DP}(\alpha, G_0)$ has mean (expectation)
$\mathrm{E}[G(\cdot)] = G_0(\cdot)$ and variance
$\mathrm{V}[G(\cdot)] = G_0(\cdot)\{1 - G_0(\cdot)\}/(\alpha + 1)$ (Ferguson, 1973).
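The stick-breaking construction above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the infinite sum is truncated at a hypothetical number of atoms, the baseline $G_0$ is taken to be a standard normal, and the function name is not from the chapter.

```python
import random

def stick_breaking_weights(alpha, n_atoms, rng):
    """First n_atoms stick-breaking weights of a DP with precision alpha.

    Also returns the stick length left over by the truncation.
    """
    weights, remaining = [], 1.0
    for _ in range(n_atoms):
        v = rng.betavariate(1.0, alpha)   # v_j ~ Be(1, alpha)
        weights.append(v * remaining)     # omega_j = v_j * prod_{k<j} (1 - v_k)
        remaining *= 1.0 - v
    return weights, remaining

rng = random.Random(0)
weights, leftover = stick_breaking_weights(alpha=2.0, n_atoms=50, rng=rng)
atoms = [rng.gauss(0.0, 1.0) for _ in weights]   # Psi_j ~ G0 (assumed N(0, 1))
```

The weights plus the leftover stick length always sum to one. Larger values of the precision parameter spread the mass over more atoms, so a longer truncation is needed.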

An important generalization of the DP prior includes the Pitman-Yor (Poisson-Dirichlet)
prior (Ishwaran & James, 2001), which assumes that $v_j \sim_{ind} \mathrm{Be}(a_{1j}, a_{2j})$, for
$j = 1, 2, \ldots$, for some $a_{1j} = 1 - a_1$ and $a_{2j} = a_2 + j a_1$ with $0 \le a_1 < 1$ and
$a_2 > -a_1$. The special case defined by $a_1 = 0$ and $a_2 = \alpha$ results in the
$\mathrm{DP}(\alpha G_0)$.

Another important generalization of the DP is given by the dependent Dirichlet process
(DDP) (MacEachern, 1999, 2000, 2001), which provides a model for the covariate-dependent
random distribution, denoted $G_{\mathbf{x}}$. The DDP model assumes that
$G_{\mathbf{x}} \sim \mathrm{DP}(\alpha_{\mathbf{x}}, G_{0\mathbf{x}})$, marginally for each $\mathbf{x}$.
Specifically, the DDP defines a covariate-dependent random distribution $G_{\mathbf{x}}$ of the form
given in equation (3), and incorporates this dependence through covariate-dependent atoms
$\boldsymbol\Psi_j(\mathbf{x})$, a covariate-dependent baseline $G_{0\mathbf{x}}$, and/or
covariate-dependent stick-breaking weights of the form
$\omega_j(\mathbf{x}) = v_j(\mathbf{x}) \prod_{k=1}^{j-1}(1 - v_k(\mathbf{x}))$, for $j = 1, 2, \ldots$.
For example, the ANOVA-linear DDP (De Iorio et al., 2004), denoted
$G_{\mathbf{x}} \sim \text{ANOVA-DDP}(\alpha, G_0, \mathbf{x})$, constructs a dependent random
distribution $G_{\mathbf{x}}(\cdot) = \sum_{j=1}^{\infty} \omega_j \delta_{\mathbf{x}^{\top}\boldsymbol\beta_j}(\cdot)$
via covariate-dependent atoms $\mathbf{x}^{\top}\boldsymbol\beta_j$, along with
$\boldsymbol\beta \sim G$ and $G \sim \mathrm{DP}(\alpha, G_0(\boldsymbol\beta))$.

Many examples of DP-mixture and DDP mixture IRT models can be found in the literature
(Qin, 1998; Duncan & MacEachern, 2008; Miyazaki & Hoshino, 2009; Farina et al.,
2009; San Martin et al., 2011; San Martin et al., 2011; Karabatsos & Walker, 2012).

3 Presentation of the Model


The BNP-IRT model is a special case of the Bayesian nonparametric regression (infinite-mixture)
model introduced by Karabatsos and Walker (2012). These authors demonstrated
that the model tended to have better predictive performance relative to DP-mixed and DDP 
mixed regression models. As will be shown, the BNP-IRT model is suitable for dichotomous 
or polytomous item responses. 

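First, we present the basic BNP-IRT model for dichotomous item responses. Let
$\mathcal{D} = \{(u_{pi}, \mathbf{x}_{pi})\}_{p=1,\ldots,P;\, i=1,\ldots,I}$ denote a set of
item-response data, including dichotomous responses $u_{pi} \in \{0, 1\}$. Also, $\mathbf{x}_{pi}$
denotes a covariate vector that describes person $p = 1, \ldots, P$ and item $i = 1, \ldots, I$.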

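The basic BNP-IRT model is defined as

$$f(\mathcal{D} \mid \mathbf{X}; \boldsymbol\zeta) = \prod_{p=1}^{P} \prod_{i=1}^{I} f(u_{pi} \mid \mathbf{x}_{pi}; \boldsymbol\zeta) \tag{4a}$$

$$f(u_{pi} \mid \mathbf{x}_{pi}; \boldsymbol\zeta) = \Pr(U_{pi} = 1 \mid \mathbf{x}_{pi}; \boldsymbol\zeta)^{u_{pi}} [1 - \Pr(U_{pi} = 1 \mid \mathbf{x}_{pi}; \boldsymbol\zeta)]^{1 - u_{pi}} \tag{4b}$$

$$\Pr(U = 1 \mid \mathbf{x}; \boldsymbol\zeta) = 1 - F^{*}(0 \mid \mathbf{x}; \boldsymbol\zeta) = \int_{0}^{\infty} f(u^{*}_{pi} \mid \mathbf{x}_{pi}; \boldsymbol\zeta)\, du^{*} \tag{4c}$$

$$= \int_{0}^{\infty} \sum_{j=-\infty}^{\infty} \mathrm{n}(u^{*} \mid \mu_j + \mathbf{x}_{pi}^{\top}\boldsymbol\beta, \sigma^2)\, \omega_j(\mathbf{x}_{pi}; \boldsymbol\beta_\omega, \sigma_\omega)\, du^{*} \tag{4d}$$

$$\omega_j(\mathbf{x}; \boldsymbol\beta_\omega, \sigma_\omega) = \Phi\!\left(\frac{j - \mathbf{x}^{\top}\boldsymbol\beta_\omega}{\sigma_\omega}\right) - \Phi\!\left(\frac{j - 1 - \mathbf{x}^{\top}\boldsymbol\beta_\omega}{\sigma_\omega}\right) \tag{4e}$$

$$\mu_j, \sigma_\mu \sim \mathrm{N}(\mu_j \mid 0, \sigma_\mu^2)\, \mathrm{U}(\sigma_\mu \mid 0, b_{\sigma\mu}) \tag{4f}$$

$$\boldsymbol\beta, \boldsymbol\beta_\omega \sim \mathrm{N}(\boldsymbol\beta \mid \mathbf{0}, \sigma^2 \mathrm{diag}(\infty, v\mathbf{1}_{N+I}))\, \mathrm{N}(\boldsymbol\beta_\omega \mid \mathbf{0}, \sigma_\omega^2 v_\omega \mathbf{I}_{N+I+1}) \tag{4g}$$

$$\sigma^2, \sigma_\omega^2 \sim \mathrm{IG}(\sigma^2 \mid a_0/2, a_0/2)\, \mathrm{IG}(\sigma_\omega^2 \mid a_\omega/2, a_\omega/2). \tag{4h}$$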

Under the model, the data likelihood is given by equations (4a) - (4e) given parameters
$\boldsymbol\zeta = (\boldsymbol\mu, \sigma^2, \boldsymbol\beta, \boldsymbol\beta_\omega, \sigma_\omega)$
with $\boldsymbol\mu = (\mu_j)_{j=-\infty}^{\infty}$. By default, the model assumes that
$\mathbf{x}_{pi}$ is a binary indicator vector with $N + I + 1$ rows (here $N = P$ denotes the number
of persons), having constant (1) in the first entry, a "1" in entry $p + 1$ to indicate person $p$,
and "-1" in entry $i + (N + 1)$ to indicate item $i$. Specifically, each vector $\mathbf{x}_{pi}$ is
defined by

$$\mathbf{x}_{pi} = (1, \mathbf{1}(p = 1), \ldots, \mathbf{1}(p = N), -\mathbf{1}(i = 1), \ldots, -\mathbf{1}(i = I))^{\top},$$

where $\mathbf{1}(\cdot)$ denotes the indicator (0,1) function. Then, in terms of the coefficient
vector $\boldsymbol\beta = (\beta_0, \beta_1, \ldots, \beta_{N+I})^{\top}$, the coefficient in entry
$p + 1$, namely $\beta_p = \theta_p$, represents the ability of person $p = 1, \ldots, P$.
Likewise, the coefficient in entry $i + (N + 1)$, namely $\beta_{N+i}$, represents the difficulty
of item $i = 1, \ldots, I$. The covariate-dependent mixture weights $\omega_j(\mathbf{x})$ in (4e)
are specified by a cumulative ordered probits regression, based on the choice of a standard normal
cdf for $\Phi(\cdot)$ with latent mean $\mathbf{x}^{\top}\boldsymbol\beta_\omega$ and variance
$\sigma_\omega^2$, for the "ordinal categories" $j = 0, \pm 1, \pm 2, \ldots$, where coefficient
vector $\boldsymbol\beta_\omega$ contains additional person parameters and item parameters.
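To make the roles of the indicator coding, the ordered-probit weights in (4e), and the response probability in (4c) - (4d) concrete, here is a minimal Python sketch. It is not the author's software: the function names are hypothetical, the infinite sum over $j$ is truncated to the indices stored in `mu`, and all parameter values are invented for illustration.

```python
import math

def norm_cdf(z):
    """Standard normal cdf, Phi(z)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def covariate_vector(p, i, n_persons, n_items):
    """x_pi = (1, 1(p=1..N), -1(i=1..I)); p and i are 1-based."""
    x = [0.0] * (1 + n_persons + n_items)
    x[0] = 1.0
    x[p] = 1.0
    x[n_persons + i] = -1.0
    return x

def mixture_weight(j, x, beta_w, sigma_w):
    """omega_j(x) in (4e): difference of two ordered-probit cdf values."""
    eta = sum(a * b for a, b in zip(x, beta_w))
    return norm_cdf((j - eta) / sigma_w) - norm_cdf((j - 1 - eta) / sigma_w)

def response_prob(x, mu, beta, sigma, beta_w, sigma_w):
    """Pr(U = 1 | x), truncated to the component indices stored in mu.

    The integral of n(u* | m, sigma^2) over (0, inf) equals Phi(m / sigma).
    """
    xb = sum(a * b for a, b in zip(x, beta))
    return sum(
        mixture_weight(j, x, beta_w, sigma_w) * norm_cdf((mu_j + xb) / sigma)
        for j, mu_j in mu.items()
    )

# Hypothetical example with N = 2 persons and I = 2 items.
beta = [0.0, 0.5, -0.5, 0.2, -0.2]      # intercept, abilities, difficulties
beta_w = [0.0, 0.3, -0.3, 0.1, -0.1]    # coefficients of the weight regression
mu = {j: 0.0 for j in range(-20, 21)}   # all locations zero: normal-ogive case
x = covariate_vector(p=1, i=2, n_persons=2, n_items=2)
prob = response_prob(x, mu, beta, sigma=1.0, beta_w=beta_w, sigma_w=1.0)
```

With every $\mu_j$ set to zero the result equals $\Phi(\theta_1 - \beta_{\text{item }2}) = \Phi(0.7)$, because the weights sum to one. Nonzero entries in `mu` produce departures from that normal-ogive curve.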

As shown in (4f) - (4h), the Bayesian model parameters $\boldsymbol\zeta$ have joint prior density

$$\pi(\boldsymbol\zeta) = \left[\prod_{j=-\infty}^{\infty} \mathrm{n}(\mu_j \mid 0, \sigma_\mu^2)\right] \mathrm{u}(\sigma_\mu \mid 0, b_{\sigma\mu})\, \mathrm{n}(\boldsymbol\beta \mid \mathbf{0}, \sigma^2 \mathrm{diag}(\infty, v\mathbf{1}_{N+I})) \tag{5a}$$

$$\times\, \mathrm{n}(\boldsymbol\beta_\omega \mid \mathbf{0}, \sigma_\omega^2 v_\omega \mathbf{I}_{N+I+1})\, \mathrm{ig}(\sigma^2 \mid a_0/2, a_0/2)\, \mathrm{ig}(\sigma_\omega^2 \mid a_\omega/2, a_\omega/2), \tag{5b}$$

where $\mathbf{1}_{N+I}$ denotes the vector of $N + I$ ones, and $\mathbf{I}_{N+I+1}$ is the identity
matrix of dimension $N + I + 1$. As shown in (5), the full specification of the prior density
relies on the choice of the parameters $(b_{\sigma\mu}, v, a_0, v_\omega, a_\omega)$. In Section 6,
where we illustrate the BNP-IRT model through the analysis of a real item response data set, we
suggest some useful default choices for these prior parameters.

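As shown by the model equations in (4a) - (4d), the item response function
$\Pr(U = 1 \mid \mathbf{x}; \boldsymbol\zeta)$ is modeled by a covariate($\mathbf{x}$)-dependent
location mixture of normal distributions for the latent variables $u^{*}_{pi}$. The random
locations $\mu_j$ of this mixture correspond to mixture weights $\omega_j(\mathbf{x})$,
$j = 0, \pm 1, \pm 2, \ldots$. Conditionally on a covariate vector $\mathbf{x}_{pi}$ and model
parameters, the latent mean and variance of the mixture can be written as

$$\mathrm{E}[U^{*}_{pi} \mid \mathbf{x}_{pi}; \boldsymbol\beta, \boldsymbol\beta_\omega, \sigma^2, \sigma_\omega] = \bar{\mu}_{pi} = \sum_{j=-\infty}^{\infty} (\mu_j + \mathbf{x}_{pi}^{\top}\boldsymbol\beta)\, \omega_j(\mathbf{x}_{pi}; \boldsymbol\beta_\omega, \sigma_\omega),$$

$$\mathrm{V}[U^{*}_{pi} \mid \mathbf{x}_{pi}; \boldsymbol\beta, \boldsymbol\beta_\omega, \sigma^2, \sigma_\omega] = \sum_{j=-\infty}^{\infty} \{(\mu_j + \mathbf{x}_{pi}^{\top}\boldsymbol\beta - \bar{\mu}_{pi})^2 + \sigma^2\}\, \omega_j(\mathbf{x}_{pi}; \boldsymbol\beta_\omega, \sigma_\omega),$$

respectively (Marron & Wand, 1992).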
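The two moment formulas are the usual mean and variance of a normal location mixture. The following small Python sketch computes them for a truncated set of components; the locations $\mu_j + \mathbf{x}^{\top}\boldsymbol\beta$ and the weights below are hypothetical numbers, not estimates from the chapter.

```python
def latent_mean_variance(locations, weights, sigma2):
    """Mean and variance of sum_j w_j * N(m_j, sigma2), with m_j = mu_j + x'beta."""
    mean = sum(w * m for m, w in zip(locations, weights))
    var = sum(w * ((m - mean) ** 2 + sigma2) for m, w in zip(locations, weights))
    return mean, var

locations = [-1.0, 0.0, 2.0]   # hypothetical values of mu_j + x'beta
weights = [0.25, 0.5, 0.25]    # hypothetical mixture weights, summing to one
mean, var = latent_mean_variance(locations, weights, sigma2=1.0)
```

For these inputs the mean is 0.25 and the variance is 2.1875: the within-component variance (1.0) plus the spread of the component locations around the mixture mean (1.1875).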

The BNP-IRT model can be viewed as an extension of the DP-mixed binary logistic
generalized linear model (Mukhopadhyay & Gelfand, 1997). In terms of the responses $u$, the
DP-mixed model can be written as

$$f(u \mid \mathbf{x}) = \sum_{j=1}^{\infty} \frac{\exp(\mu_j + \mathbf{x}^{\top}\boldsymbol\beta)^{u}}{1 + \exp(\mu_j + \mathbf{x}^{\top}\boldsymbol\beta)}\, \omega_j$$

$$\omega_j = v_j \prod_{k=1}^{j-1} (1 - v_k)$$

$$v_j \sim \mathrm{Be}(1, \alpha), \quad j = 1, 2, \ldots$$

$$\mu_j \sim \mathrm{N}(0, \sigma_\mu^2), \quad j = 1, 2, \ldots$$

$$\boldsymbol\beta \sim \mathrm{N}(\mathbf{0}, \boldsymbol\Sigma_\beta).$$

This model thus defines a mixture of logistic cdfs for the inverse link function, with weights
$\omega_j$ that are not covariate-dependent. In contrast, as shown in (4c) - (4d), the BNP-IRT model
in (4) is based on a mixture of normal cdfs for the inverse link function. The BNP-IRT model
is more flexible than the DP model, because the former uses covariate-dependent mixture
weights, as shown in (4e).

In other words, if $\mu_j = 0$ for all $j$, then the BNP-IRT model reduces to the Rasch IRT
model with "normal-ogive" response functions; all items are assumed to have a common slope
(discrimination) parameter that is proportional to $1/\sigma$. Nonzero values of $\mu_j$, along
with the covariate-dependent mixture weights $\omega_j(\mathbf{x}; \boldsymbol\beta_\omega, \sigma_\omega)$,
for $j = 0, \pm 1, \pm 2, \ldots$, allow the BNP-IRT model to shift the location of each response
function across persons and items. A value of $\mu_j > 0$ ($\mu_j < 0$) shifts the response function
to the left (right). The BNP-IRT model allows for this shifting in a flexible manner, accounting
for any outlying responses (relative to a normal-ogive Rasch model). This feature enables
inferences of person and item parameters from the BNP-IRT model that are robust against such
outliers.
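The reduction to the normal-ogive Rasch model can be checked directly from (4c) - (4e). The short derivation below is a sketch that assumes the default indicator coding of $\mathbf{x}_{pi}$ and writes $b_i$ for the difficulty of item $i$, a label introduced here only for readability.

```latex
\begin{align*}
\Pr(U_{pi} = 1 \mid \mathbf{x}_{pi}; \boldsymbol\zeta)
  &= \sum_{j=-\infty}^{\infty}
     \omega_j(\mathbf{x}_{pi}; \boldsymbol\beta_\omega, \sigma_\omega)\,
     \Phi\!\left(\frac{\mu_j + \mathbf{x}_{pi}^{\top}\boldsymbol\beta}{\sigma}\right) \\
  &= \Phi\!\left(\frac{\mathbf{x}_{pi}^{\top}\boldsymbol\beta}{\sigma}\right)
     \sum_{j=-\infty}^{\infty}
     \omega_j(\mathbf{x}_{pi}; \boldsymbol\beta_\omega, \sigma_\omega)
     \qquad (\mu_j = 0 \text{ for all } j) \\
  &= \Phi\!\left(\frac{\beta_0 + \theta_p - b_i}{\sigma}\right).
\end{align*}
```

The first line uses $\int_0^\infty \mathrm{n}(u^{*} \mid m, \sigma^2)\, du^{*} = \Phi(m/\sigma)$, and the last line uses the fact that the weights in (4e) telescope to one. A positive $\mu_j$ adds to the argument of $\Phi$, which is why it moves the corresponding component curve to the left along the ability scale.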

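According to Bayes' theorem, a set of data $\mathcal{D}$ updates the prior probability density
$\pi(\boldsymbol\zeta)$ in (5), leading to the posterior probability density

$$\pi(\boldsymbol\zeta \mid \mathcal{D}) = \frac{f(\mathcal{D} \mid \mathbf{X}; \boldsymbol\zeta)\, \pi(\boldsymbol\zeta)}{\int f(\mathcal{D} \mid \mathbf{X}; \boldsymbol\zeta)\, \pi(\boldsymbol\zeta)\, d\boldsymbol\zeta}.$$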


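Also, conditionally on $(\mathbf{x}_{pi}, \mathcal{D})$, the posterior predictive pmf and the
posterior expectation (E) and variance (V) of the item response $U_{pi}$ are given by

$$f(u_{pi} \mid \mathbf{x}_{pi}, \mathcal{D}) = \int f(u_{pi} \mid \mathbf{x}_{pi}; \boldsymbol\zeta)\, \pi(\boldsymbol\zeta \mid \mathcal{D})\, d\boldsymbol\zeta, \tag{6}$$

$$\mathrm{E}[U_{pi} \mid \mathbf{x}_{pi}, \mathcal{D}] = f(U_{pi} = 1 \mid \mathbf{x}_{pi}, \mathcal{D}) = f(1 \mid \mathbf{x}_{pi}, \mathcal{D}), \tag{7}$$

$$\mathrm{V}[U_{pi} \mid \mathbf{x}_{pi}, \mathcal{D}] = f(1 \mid \mathbf{x}_{pi}, \mathcal{D})\, [1 - f(1 \mid \mathbf{x}_{pi}, \mathcal{D})], \tag{8}$$

respectively.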
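In practice the integral in (6) is approximated by averaging over posterior draws. The sketch below assumes that, for one person-item pair, the response probability $\Pr(U = 1 \mid \mathbf{x}; \boldsymbol\zeta^{(s)})$ has already been evaluated at each MCMC draw $\boldsymbol\zeta^{(s)}$; the function name and the five values shown are hypothetical.

```python
def posterior_predictive(prob_draws):
    """Monte Carlo estimates of (6)-(8) from draws of Pr(U = 1 | x; zeta^(s))."""
    f1 = sum(prob_draws) / len(prob_draws)   # estimate of f(1 | x, D)
    return {
        "pmf": {0: 1.0 - f1, 1: f1},         # equation (6)
        "mean": f1,                          # equation (7)
        "variance": f1 * (1.0 - f1),         # equation (8)
    }

draws = [0.70, 0.80, 0.75, 0.85, 0.65]   # hypothetical values, one per MCMC draw
summary = posterior_predictive(draws)
```

Here the estimated posterior predictive mean is 0.75 and the variance is 0.1875.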

It is straightforward to extend the BNP-IRT regression model to other types of response
data by making appropriate choices of covariate vector $\mathbf{x}$ (corresponding to coefficients
$\boldsymbol\beta, \boldsymbol\beta_\omega$). Such extensions are described as follows:

1. Suppose that for each item $i = 1, \ldots, I$ the responses are each scored in more than
two categories, say $m_i + 1$ nominal or ordinal categories denoted as $u' = 0, 1, \ldots, m_i$,
with $u' = 0$ the reference category. Then the model can be extended to handle such
polytomous item responses using the Begg and Gray (1984) method. Specifically, the
model would assume the response to be defined by $u_{pi} = \mathbf{1}(u'_{pi} > 0)$ and each
covariate vector $\mathbf{x}_{pi}$ to be defined by a binary indicator vector:

$$\mathbf{x}_{pi} = (1, \mathbf{1}(p = 1), \ldots, \mathbf{1}(p = N), \mathbf{1}(i = 1)\mathbf{1}(u'_{pi} = 1), \ldots, \mathbf{1}(i = I)\mathbf{1}(u'_{pi} = 1), \ldots,$$
$$\mathbf{1}(i = 1)\mathbf{1}(u'_{pi} = m^{*}), \ldots, \mathbf{1}(i = I)\mathbf{1}(u'_{pi} = m^{*}))^{\top}.$$

Then in terms of the coefficient vector $\boldsymbol\beta$, which has $1 + P + m^{*}I$ entries, the
coefficient in entry $1 + p$ equals $\theta_p$, $p = 1, \ldots, P$, and represents the latent
ability of person $p$, and the coefficient attached to the indicator for item $i$ and category $u$
represents the latent difficulty of item $i = 1, \ldots, I$ and category $u = 1, \ldots, m^{*}$,
where $m^{*} = \max_i m_i$.

2. If the data has additional covariates $(x_1, \ldots, x_q)$ which describe either the persons
(e.g., socioeconomic status), test items (e.g., item type), or type of response (e.g.,
response time), associated with each person $p$ and item $i$, then these covariates can
be added as the last $q$ elements to each of the covariate vectors $\mathbf{x}_{pi}$, such that
$\mathbf{x}_{pi} = (\ldots, x_1, \ldots, x_q)^{\top}$, $p = 1, \ldots, P$ and $i = 1, \ldots, I$. Then,
specific elements of coefficient vector $\boldsymbol\beta$, namely the elements $\beta_k$,
$k = \dim(\boldsymbol\beta) - q + 1, \ldots, \dim(\boldsymbol\beta)$, would represent
the associations of the $q$ covariates with the responses.

3. Similarly, suppose that a given test consists of measuring one or more of $D < I$ measurement
dimensions. Then we can extend the model to represent such multidimensional
items, by including $D$ binary (0,1) covariates into the covariate vectors $\mathbf{x}_{pi}$,
$p = 1, \ldots, P$ and $i = 1, \ldots, I$, such that the first set of elements of $\mathbf{x}_{pi}$
is defined by

$$\mathbf{x}_{pi} = (1, \mathbf{1}(p = 1)\mathbf{1}(d_i = 1), \ldots, \mathbf{1}(p = N)\mathbf{1}(d_i = 1), \ldots, \mathbf{1}(p = 1)\mathbf{1}(d_i = D), \ldots, \mathbf{1}(p = N)\mathbf{1}(d_i = D), \ldots)^{\top},$$

where $d_i \in \{1, \ldots, D\}$ denotes the measurement dimension of item $i$. Then specific
elements of the coefficient vector $\boldsymbol\beta$, namely the elements $\beta_k$ for
$k = 2, \ldots, ND + 1$, indicate each person's ability on dimension $d = 1, \ldots, D$.

4 Parameter Estimation


By using latent-variable Gibbs sampling methods for Bayesian infinite-mixture models (Kalli
et al. 2011), it is possible to conduct exact MCMC sampling from the posterior distribution
of the BNP-IRT model parameters. More specifically, introducing latent variables
$(u_{pi}, z_{pi} \in \mathbb{Z}, u^{*}_{pi} \in \mathbb{R})_{N \times I}$ and a fixed decreasing function
such as $\xi_l = \exp(-l)$, the conditional likelihood of the BNP-IRT model can be written as

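$$\prod_{p=1}^{P} \prod_{i=1}^{I} \mathrm{n}(u^{*}_{pi} \mid \mu_{z_{pi}} + \mathbf{x}_{pi}^{\top}\boldsymbol\beta, \sigma^2)\, \mathbf{1}(0 < u_{pi} < \xi_{|z_{pi}|})\, \xi_{|z_{pi}|}^{-1}\, \omega_{z_{pi}}(\mathbf{x}_{pi}; \boldsymbol\beta_\omega, \sigma_\omega). \tag{9}$$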

For each $(p, i)$, after marginalizing over the latent variables in (9) we obtain the original
model likelihood $f(u_{pi} \mid \mathbf{x}_{pi}; \boldsymbol\zeta)$ in (4a). Importantly,
conditionally on the latent variables, the infinite-dimensional BNP-IRT model can be treated as a
finite-dimensional model, which then makes the task of MCMC sampling feasible (of course, even a
computer cannot handle an infinite number of parameters). Given all variables, save the latent
variables $(z_{pi})$, the choice of each $z_{pi}$ has finite maximum value $\pm N_{\max}$, where
$N_{\max} = \max_p [\max_i \{\max_j \mathbf{1}(u_{pi} < \xi_j)\, |j|\}]$.

Then standard MCMC methods can be used to sample the full conditional posterior
distributions of each latent variable and model parameter repeatedly for a sufficiently large
number of times, $S$. If the prior $\pi(\boldsymbol\zeta)$ is proper (Robert & Casella, 2004,
sect. 10.4.3), then, for $S \to \infty$, this sampling process constructs a discrete-time Harris
ergodic Markov chain

$$\{(u_{pi}^{(s)}), (z_{pi}^{(s)}), (u_{pi}^{*(s)}), \boldsymbol\zeta^{(s)}\}_{s=1}^{S},$$

which, after marginalizing out all the latent variables $(u_{pi})$, $(z_{pi})$, $(u^{*}_{pi})$, has
the posterior distribution $\Pi(\boldsymbol\zeta \mid \mathcal{D})$ as its stationary distribution
(for definitions, see Meyn & Tweedie, 1993; Nummelin, 1984; Roberts & Rosenthal, 2004). (The next
paragraph provides more details about the latent variables.)

The full conditional posterior distributions are as follows: the one of $u_{pi}$ is
$\mathrm{u}(u_{pi} \mid 0, \xi_{|z_{pi}|})$; $u^{*}_{pi}$ has a truncated normal distribution; the
one of $z_{pi}$ is a multinomial distribution, independently for $p = 1, \ldots, P$ and
$i = 1, \ldots, I$; the full conditional distribution of $\mu_j$ is a normal distribution (sampled
using a Metropolis-Hastings algorithm), independently for $j = -N_{\max}, \ldots, N_{\max}$;
$\sigma_\mu$ can be sampled using a slice sampling algorithm involving a stepping-out procedure
(Neal, 2003); the one of $\boldsymbol\beta$ is a multivariate normal distribution; and the full
conditional posterior distribution of $\sigma^2$ is inverse-gamma. Also, upon sampling of
truncated normal latent variables $z^{*}_{pi}$ that have full conditional densities proportional to
$\mathrm{n}(z^{*}_{pi} \mid \mathbf{x}_{pi}^{\top}\boldsymbol\beta_\omega, \sigma_\omega^2)\, \mathbf{1}(z_{pi} - 1 < z^{*}_{pi} < z_{pi})$,
independently for $p = 1, \ldots, P$ and $i = 1, \ldots, I$, the full conditional posterior
distribution of $\boldsymbol\beta_\omega$ is a multivariate normal distribution and the one of
$\sigma_\omega^2$ is an inverse-gamma distribution. For further details of the MCMC algorithm, see
Karabatsos and Walker (2012).
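For readers unfamiliar with the stepping-out procedure cited above, the following is a generic, minimal Python sketch of a univariate slice sampler in the style of Neal (2003). It is not the implementation used in the author's software: the function name, the interval width, and the demonstration target (a standard normal log-density rather than the full conditional of $\sigma_\mu$) are all assumptions made for illustration.

```python
import math
import random

def slice_sample(log_density, x0, width, n_samples, rng,
                 lower=-math.inf, upper=math.inf):
    """Univariate slice sampler with stepping-out and shrinkage."""
    samples, x = [], x0
    for _ in range(n_samples):
        # Slice level on the log scale: log f(x) minus an Exponential(1) draw.
        log_y = log_density(x) - rng.expovariate(1.0)
        # Stepping out: place an interval of size `width` at random around x,
        # then expand each end until it leaves the slice or reaches a bound.
        left = x - width * rng.random()
        right = left + width
        while left > lower and log_density(left) > log_y:
            left -= width
        while right < upper and log_density(right) > log_y:
            right += width
        left, right = max(left, lower), min(right, upper)
        # Shrinkage: propose uniformly, shrinking the interval on rejection.
        while True:
            proposal = rng.uniform(left, right)
            if log_density(proposal) > log_y:
                x = proposal
                break
            if proposal < x:
                left = proposal
            else:
                right = proposal
        samples.append(x)
    return samples

# Demonstration on a standard normal target (log-density up to a constant).
rng = random.Random(1)
draws = slice_sample(lambda x: -0.5 * x * x, x0=0.0, width=2.0,
                     n_samples=5000, rng=rng)
```

For a parameter with bounded support, such as a scale parameter with a uniform prior on an interval, the `lower` and `upper` arguments keep the interval inside the support.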

In practice, only an MCMC chain based on a finite number $S$ of samples can be generated.
The convergence of finite MCMC chains to samples from posterior distributions can
be assessed using the following two procedures (Geyer, 2011): (i) viewing univariate trace
plots of the model parameters to evaluate MCMC mixing (Robert & Casella, 2004); and (ii)
conducting a batch-mean (or subsampling) analysis of the finite chain, which would provide
95% Monte Carlo confidence intervals (MCCIs) of all the posterior mean and quantile estimates
of the model parameters (Flegal & Jones, 2011). Convergence can be confirmed both
by trace plots that look stable and "hairy" and by 95% MCCIs that, for all practical purposes,
are sufficiently small. If convergence is not attained for the current choice of $S$ samples of an
MCMC chain, additional MCMC samples should be generated until convergence is obtained.
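A batch-means analysis for a posterior mean can be sketched as follows. This is a simplified illustration, not the procedure coded in the author's software: it uses non-overlapping batches, a normal quantile (1.96) instead of a Student-t quantile, and a hypothetical default of 30 batches.

```python
import math

def batch_means_halfwidth(chain, n_batches=30, z=1.96):
    """Half-width of an approximate 95% Monte Carlo confidence interval
    for the mean of `chain`, using non-overlapping batch means."""
    batch_size = len(chain) // n_batches
    means = [
        sum(chain[b * batch_size:(b + 1) * batch_size]) / batch_size
        for b in range(n_batches)
    ]
    grand = sum(means) / n_batches
    var_of_means = sum((m - grand) ** 2 for m in means) / (n_batches - 1)
    # Monte Carlo standard error of the overall mean.
    return z * math.sqrt(var_of_means / n_batches)

constant_chain = [1.0] * 600
step_chain = [0.0] * 300 + [1.0] * 300   # a chain that has clearly not mixed
hw_constant = batch_means_halfwidth(constant_chain)
hw_step = batch_means_halfwidth(step_chain)
```

A chain that drifts or sticks produces batch means that disagree with one another, so its half-width stays large until more samples are generated.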

5 Model Fit 

The fit of the BNP-IRT model to a set of item response data, $\mathcal{D}$, can be assessed on the
basis of its posterior predictive pmf, defined in (6).

More specifically, the fit to a given response $u_{pi}$ can be assessed by its standardized
response residual

$$r_{pi} = \frac{u_{pi} - \mathrm{E}[U_{pi} \mid \mathbf{x}_{pi}, \mathcal{D}]}{\sqrt{\mathrm{V}[U_{pi} \mid \mathbf{x}_{pi}, \mathcal{D}]}}.$$

Response $u_{pi}$ can be judged to be an outlier when $|r_{pi}|$ exceeds two or three.

A global measure of the predictive fit of a regression model, indexed by $m \in \{1, \ldots, M\}$,
is provided by the mean-squared predictive error criterion

$$D(m) = \sum_{p=1}^{P} \sum_{i=1}^{I} \{u_{pi} - \mathrm{E}[U_{pi} \mid \mathbf{x}_{pi}, m]\}^2 + \sum_{p=1}^{P} \sum_{i=1}^{I} \mathrm{V}[U_{pi} \mid \mathbf{x}_{pi}, m]$$

(Laud & Ibrahim, 1995; Gelfand & Ghosh, 1998). The first term of $D(m)$ measures the
goodness-of-fit (Gof($m$)) of the model to the data, while its second term is a penalty for
model complexity. Among a set of $M$ models that are compared, the model with the
highest predictive accuracy for the data set $\mathcal{D}$ is identified as the one with the
smallest value of $D(m)$.

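The proportion of variance explained by the regression model is given by the R-squared
($R^2$) statistic

$$R^2 = 1 - \frac{\sum_{p=1}^{P} \sum_{i=1}^{I} \{u_{pi} - \mathrm{E}[U_{pi} \mid \mathbf{x}_{pi}, \mathcal{D}]\}^2}{\sum_{p=1}^{P} \sum_{i=1}^{I} (u_{pi} - \bar{u})^2},$$

where $\bar{u} = \frac{1}{PI} \sum_{p=1}^{P} \sum_{i=1}^{I} u_{pi}$.

The standardized residuals $r_{pi}$, the $D(m)$ criterion, and $R^2$ can each be estimated as a
simple by-product of an MCMC algorithm.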
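Given posterior predictive means and variances for each observed response, the three fit summaries reduce to a few sums. The sketch below assumes the usual definitions (residual divided by the posterior predictive standard deviation, $D(m)$ as squared error plus summed variance, and $R^2$ as one minus the ratio of residual to total sum of squares); the function name and the four data points are hypothetical.

```python
import math

def fit_statistics(u, post_mean, post_var):
    """Standardized residuals, D(m) and R-squared from lists over all (p, i)."""
    residuals = [(y - e) / math.sqrt(v) for y, e, v in zip(u, post_mean, post_var)]
    gof = sum((y - e) ** 2 for y, e in zip(u, post_mean))   # goodness-of-fit term
    penalty = sum(post_var)                                 # penalty term
    u_bar = sum(u) / len(u)
    total_ss = sum((y - u_bar) ** 2 for y in u)
    return {
        "residuals": residuals,
        "gof": gof,
        "penalty": penalty,
        "D": gof + penalty,
        "r_squared": 1.0 - gof / total_ss,
        "n_outliers": sum(abs(r) > 2.0 for r in residuals),
    }

u = [1, 0, 1, 1]                              # hypothetical observed responses
post_mean = [0.9, 0.2, 0.8, 0.6]              # hypothetical E[U_pi | x_pi, D]
post_var = [e * (1 - e) for e in post_mean]   # equation (8)
stats = fit_statistics(u, post_mean, post_var)
```

For these inputs Gof is 0.25, the penalty is 0.65, $D(m)$ is 0.90, and $R^2$ is about 0.667, with no residual exceeding two in absolute value.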

6 Empirical Example 

Using the BNP-IRT model, we analyzed a set of polytomous response data obtained from
the 2006 Progress in International Reading Literacy Study. A total of $N = 244$ fourth-grade
U.S. teachers rated their own teaching preparation level in a ten-item questionnaire ($I = 10$).
Each item was scored on a scale ranging from zero to two.

For this questionnaire, the latent person ability was assumed to represent the level of
teaching preparation. The ten items addressed the following areas: education level (named
CERTIFICATE), English LANGUAGE, LITERATURE, teaching reading (PEDAGOGY),
PSYCHOLOGY, REMEDIAL reading, THEORY of reading, children's language development
(LANGDEV), special education (SPED), and second language (SECLANG) learning.
The CERTIFICATE item was scored on a scale of 0 = bachelor's, 1 = master's, 2 = doctoral,
while the other 9 questionnaire items were each scored on a scale consisting of 0 = not at
all, 1 = overview or introduction to topic, and 2 = area of emphasis. Each of the ten items
described a type of training for literacy teachers, as prescribed by the National Research
Council (2010).

We considered three additional covariates for the BNP-IRT model, namely AGE level
(scored in nine ordinal categories), FEMALE status, and Miss:FEMALE, an indicator (0,1)
of missing value for FEMALE status. Overall, 2,419 of the total possible 2,440 item responses
were observed. Three of the 244 teachers had missing values for FEMALE, which were imputed
using information from the observed values of all the variables mentioned above.

Given that each of the 10 items was scored on a polytomous scale (3 categories), and
that we were interested in additional covariates over and beyond the person-indicator and
item-indicator covariates, we analyzed the data using the BNP-IRT model, using extensions
1 and 2 of the basic BNP-IRT model in Section 3 above. Also, the parameters of the
prior pdf (5) of the model were chosen as $(\ldots, v, a_0, a_\omega) = (1, 10, 1000, 1, .01)$.

To estimate the posterior distribution of the BNP-IRT model parameters, we ran the
MCMC sampling algorithm in Section 4 for 62,000 iterations. We used 12,000 MCMC
samples for posterior inference, retaining every fifth sample beyond the first 2,000 iterations
(burn-in) to obtain (pseudo-) independence between them. Trace plots for the univariate
parameters displayed adequate mixing (i.e., exploration of the posterior distribution), and
a batch-mean (subsampling) analysis of the 12,000 MCMC samples revealed 95% Monte
Carlo confidence intervals of the posterior mean and quantile estimates (reported below)
that typically had half-widths less than .2. If desired, smaller half-widths could have been
obtained by generating additional MCMC samples.
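The sample counts quoted above are consistent with one another, as the following small Python check shows. The exact iteration indices that were retained are an assumption; only the counts come from the text.

```python
total_iterations = 62_000
burn_in = 2_000
thin = 5

# Assumed 0-based indices of the retained iterations: every fifth one after burn-in.
retained = range(burn_in, total_iterations, thin)
n_retained = len(retained)   # (62,000 - 2,000) / 5 = 12,000
```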

For the BNP-IRT model, the standardized response residuals ranged from -.21 to .20,
meaning that the model had no outliers (i.e., all the absolute standardized residuals were
well below two). Globally, the model fit analyses yielded criterion value $D(m) = 2.76$ (with
Gof($m$) = .03 and Penalty $P(m) = 2.73$) for the 2,419 responses in the data set. Also, the
BNP-IRT model attained an R-squared of one.

Insert Figure 1


The estimated posterior means of the person ability parameters were found to be distributed
with mean .00, standard deviation .46, minimum -.66, and maximum 3.68 for the
244 persons. Figure 1 presents a box plot of the marginal posterior distributions (full range,
interquartile range, median), for all the remaining parameters, including the item-difficulty
parameters and the slope coefficients of the covariates AGE, FEMALE, and Miss:FEMALE.
Parameter labels such as CERTIFICATE(1) and CERTIFICATE(2) refer to the difficulty of
the CERTIFICATE item, with respect to its rating categories 1 and 2, respectively. The most
difficult item was REMEDIAL(2) (with posterior median difficulty of .27), and the easiest
item was SECLANG(1) (posterior median difficulty -1.81). Also, the covariates AGE and
FEMALE were each found to have a significant positive association with the rating response,
since they had coefficients with 75% posterior intervals that excluded zero (this type of
interpretation of significance was justified by Li & Lin, 2010). The box plot also presents the
marginal posterior distributions for all the item and covariate parameters in
$\boldsymbol\beta_\omega$, the mixture weights, and the variance parameters.

7 Discussion 

In this chapter, we proposed and illustrated a practical and yet flexible BNP-IRT model,
which can provide robust estimates of person ability and item difficulty parameters. We
demonstrated the suitability of the model through the analysis of real polytomous item
response data. The model showed excellent predictive performance for the data, with no
item response outliers.

For the BNP-IRT model, user-friendly, menu-driven software, entitled "Bayesian
Regression: Nonparametric and Parametric Models," is freely downloadable from the author's
website (Karabatsos, 2014a,b). The BNP-IRT model can be easily specified by clicking the
menu options "Specify New Model" and "Binary infinite homoscedastic probits regression
model." Afterwards, the response variable, covariates, and prior parameters can be selected
by the user. Then, to run the data analysis, the user can click the "Run Posterior Analysis"
button to start the MCMC sampling algorithm in Section 4 for a chosen number of iterations.
Upon completion of the MCMC run, the software automatically opens a text output file
containing the results, which include summaries of the posterior distribution of the model
obtained from the MCMC samples. The software also allows the user to check for MCMC
convergence through menu options that can be clicked to construct trace plots or run a
batch-mean analysis that produces 95% Monte Carlo confidence intervals of the posterior
estimates of the model parameters. Other menu options allow the user to construct plots
(e.g., box plots) and text with the estimated marginal posterior distributions of the model
parameters, or residual plots and text reports that describe the fit of the BNP-IRT model in
greater detail.

Currently, the software provides a choice of 59 statistical models, including a large number 
of BNP regression models. The choice allows the user to specify DP-mixture (or more 
generally, stick-breaking-mixture) IRT models, with the mixing done either on the intercept 
parameter or the entire vector of regression coefficient parameters. 

An interesting extension of the BNP-IRT model would involve specifying the kernel of the
mixture by a cognitive model. For example, one may consider the multinomial processing
tree (MPT) model (e.g., Batchelder & Riefer, 1999) with parameters that describe the latent
processes underlying the responses. Such an extension would provide a flexible, infinite
mixture of cognitive models that allows cognitive parameters to vary flexibly as a function
of (infinitely many) covariate-dependent mixture weights.

Figure Caption


Figure 1. For the BNP-IRT model, a box plot of the marginal posterior distributions
of the item, covariate, and prior parameters. For each of these model parameters, the box
plot presents the range, interquartile range, and median.